ABSTRACT


In this paper, we develop simple models to study the per- 
formance of BitTorrent, a second generation peer-to-peer 
(P2P) application. We first present a simple fluid model 
and study the scalability, performance and efficiency of such 
a file-sharing mechanism. We then consider the built-in in- 
centive mechanism of BitTorrent and study its effect on net- 
work performance. We also provide numerical results based 
on both simulations and real traces obtained from the In- 
ternet. 


Categories and Subject Descriptors 
H.1.0 [Information Systems]: Models and Principles 


General Terms 


Performance 
1. INTRODUCTION


Peer-to-Peer (P2P) applications have become immensely popular in the Internet. Traffic measurements show that P2P traffic is starting to dominate the bandwidth in certain segments of the Internet [2]. Among P2P applications, file sharing is perhaps the most popular. Compared to traditional client/server file sharing (such as FTP or the WWW), P2P file sharing has one big advantage, namely, scalability. The performance of traditional file-sharing applications deteriorates rapidly as the number of clients increases, while in a well-designed P2P file-sharing system, more peers generally means better performance. There are many P2P file-sharing programs, such as KaZaA, Gnutella, eDonkey/Overnet, and BitTorrent, to name a few. In this paper, we develop simple models to understand and study the behavior of BitTorrent [8], which is proving to be one of the more popular P2P applications today.

For a BitTorrent network (or a general P2P file sharing 
network), several issues have to be addressed in order to 
understand the behavior of the system. 


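• Peer Evolution: In P2P file sharing, the number of peers in the system is an important factor in determining network performance. Therefore, it is useful to study how the number of peers evolves over time as a function of quantities such as the request arrival rate, the peer departure rate, and the uploading/downloading bandwidth of each peer.
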
• Scalability: To realize the advantages of P2P file sharing, it is important for the network performance to not deteriorate, and preferably to actually improve, as the size of the network increases.


• File Sharing Efficiency: It is common for peers in a P2P network to have different uploading/downloading bandwidths. Further, in BitTorrent-like systems, a file may be broken into smaller pieces and the pieces may be distributed at random among the peers in the network. To efficiently download the file, it is important to design the file-sharing protocol so that each peer is matched with others that have the pieces that it needs and, further, to ensure that the downloading bandwidth of each peer is fully utilized.


• Incentives to prevent free-riding: Free-riding is a major cause for concern in P2P networks. Free-riders are peers who try to download from others while not contributing to the network, i.e., by not uploading to others. Thus, most P2P networks try to build in some incentives to deter peers from free-riding. It is important to study the effect of such behavior on the network performance.


1.1 Relationship to prior work 


The basic idea of a P2P network is to have peers participate in an application-level overlay network and operate as both servers and clients. Since the service burden is distributed to all participating peers, the system is expected to scale well even when the network is very large. Besides file sharing, P2P overlays have also been deployed in distributed directory services [18, 21], web caches [15], storage [9], and grid computation [1].

While early work on P2P systems has mainly focused on 
system design and traffic measurement [19, 20, 17], some re- 
cent research has emphasized performance analysis. In [13], 
a closed queueing system is used to model a general P2P 
file sharing system and basic insights on the stationary per- 
formance are provided. In [6, 7], a stochastic fluid model is 
used to study the performance of P2P web cache (SQUIR- 
REL) and cache clusters. A part of our work is motivated by 
the models in [11, 24], where a branching process is used to 
study the service capacity of BitTorrent-like P2P file shar- 
ing in the transient regime and a simple Markovian model 
is presented to study the steady-state properties. Our work 
differs from [11, 24] in the following respects: 


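• Instead of studying the Markov chain numerically, we develop a simple deterministic model which allows us to gain insight into the performance of the P2P network. We also incorporate realistic scenarios in our fluid model, such as departures of impatient downloaders and a downloading bandwidth constraint.

• Then, we develop a simple stochastic fluid model which characterizes the variability around the equilibrium predicted by the deterministic fluid model.
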
• We also develop a simple model to study the efficiency of downloading from other peers and argue that the file-sharing protocol in BitTorrent is very efficient.

• Finally, we consider the mechanisms built into BitTorrent to avoid free-riding and study the impact of these mechanisms on the users' behaviors and network performance.


2. A BRIEF DESCRIPTION OF BITTORRENT


BitTorrent is a P2P application whose goal is to facilitate fast downloads of popular files. Here we provide a brief description of how BitTorrent operates when a single file is downloaded by many users. Typically, the number of simultaneous downloaders for popular files could be of the order of a few hundred, while the total number of downloaders during the lifetime of a file could be of the order of several tens or sometimes even hundreds of thousands. The basic idea in BitTorrent is to divide a single large file (typically a few hundred MB long) into pieces of size 256 KB each. The peers attempting to download the file do so by connecting to several other peers simultaneously and downloading different pieces of the file from different peers.

To facilitate this process, BitTorrent uses a centralized piece of software called the tracker. In a BitTorrent network, a peer that wants to download a file first connects to the tracker of the file. The tracker then returns a random list of peers that have the file. The downloader then establishes connections to these other peers and finds out which pieces reside in each of them. A downloader then requests pieces that it does not have from all the peers to which it is connected. But each peer is allowed to upload only to a fixed number of peers (the default is four) at a given time. Uploading is called unchoking in BitTorrent. Which peers to unchoke is determined by the current downloading rate from these peers, i.e., each peer uploads to the four peers that provide it with the best downloading rate even though it may have received requests from more than four downloaders. This mechanism is intended to deter free-riding. Since a peer is only uploading to four other peers at any time, it is possible that a peer, say Peer A, may not be uploading to a peer, say Peer B, which could provide a higher downloading rate than any of the peers to which Peer A is currently uploading. Therefore, to allow each peer to explore the downloading rates of other peers, BitTorrent uses a process called optimistic unchoking. Under optimistic unchoking, each peer randomly selects a fifth peer from which it has received a downloading request and uploads to this peer. Thus, including optimistic unchoking, a peer may be uploading to five other peers at any time. Optimistic unchoking is attempted once every 30 seconds and, to allow optimistic unchoking while keeping the maximum number of uploads equal to five, the upload to the peer with the least downloading rate is dropped.
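
The unchoking rule described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function name `choose_unchoked`, the dictionary of measured download rates, and the explicit random generator are hypothetical and do not come from any BitTorrent client.

```python
import random


def choose_unchoked(download_rates, interested, n_regular=4, rng=None):
    """Pick the peers to upload to (illustrative sketch, not client code).

    download_rates: dict peer_id -> rate at which we currently download from that peer
    interested:     set of peer_ids that have requested data from us
    n_regular:      number of regular upload slots (four by default, as in the text)
    rng:            optional random.Random used for the optimistic unchoke
    """
    rng = rng or random.Random()
    # Regular slots: the requesting peers that give us the best download rates.
    ranked = sorted(interested, key=lambda p: download_rates.get(p, 0.0), reverse=True)
    regular = ranked[:n_regular]
    # Optimistic unchoke: one extra peer chosen at random among the others.
    remaining = ranked[n_regular:]
    optimistic = rng.choice(remaining) if remaining else None
    return regular, optimistic
```

A real client re-evaluates this choice periodically; the sketch shows a single decision only.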

BitTorrent distinguishes between two types of peers, namely 
downloaders and seeds. Downloaders are peers who only 
have a part (or none) of the file while seeds are peers who 
have all the pieces of the file but stay in the system to al- 
low other peers to download from them. Thus, seeds only 
perform uploading while downloaders download pieces that 
they do not have and upload pieces that they have. Ideally, 
one would like an incentive mechanism to encourage seeds 
to stay in the system. However, BitTorrent currently does 
not have such a feature. We simply analyze the performance 
of BitTorrent as is. 

In practice, a BitTorrent network is a very complicated system. There may be hundreds of peers in the system. Each peer may have different parts of the file. Each peer may also have different uploading/downloading bandwidth. Further, each peer only has partial information of the whole network and can only make decisions based on local information. In addition, BitTorrent has a protocol (called the rarest-first policy) to ensure a uniform distribution of pieces among the peers and a protocol (called the endgame mode) to prevent users who have all but a few of the pieces from waiting too long to finish their download. As with any good modelling exercise, we trade off between the simplicity of the model and its ability to capture all facets of the protocol. Thus, we will first use a simple fluid model to study the scalability and the stability of the system. We will then assume that each peer has global information and study the incentive mechanism of BitTorrent. Finally, we will briefly study the effect of optimistic unchoking on free-riding.


3. A SIMPLE FLUID MODEL 


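Our model for file sharing is closely related to the one in [11]. However, while [11] only uses the model to develop a Markov chain which is then studied numerically, we use the model to develop a simple deterministic fluid model that is amenable to analysis and provides insight into the system performance.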
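In our model, we use the following quantities to capture a BitTorrent peer-to-peer network [8] that serves a given file (without loss of generality, we assume that the file size is 1):

$x(t)$: number of downloaders (also known as leechers) in the system at time $t$.

$y(t)$: number of seeds in the system at time $t$.

$\lambda$: the arrival rate of new requests. We assume that peers arrive according to a Poisson process.

$\mu$: the uploading bandwidth of a given peer. We assume that all peers have the same uploading bandwidth.

$c$: the downloading bandwidth of a given peer.

$\theta$: the rate at which downloaders abort the download.

$\gamma$: the rate at which seeds leave the system.

$\eta$: indicates the effectiveness of the file sharing, which we will describe shortly. $\eta$ takes values in $[0, 1]$.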


In a BitTorrent-like P2P network, a downloader can upload data to other peers even though it may only have parts of a file. The parameter $\eta$ is used to indicate the effectiveness of this file sharing. If $\eta = 0$, then the downloaders do not upload data to each other and only download from seeds.

To obtain a Markovian description of the system, we make exponential assumptions on the relevant random times. These assumptions can be easily relaxed to allow more general distributions for all the random variables involved by using phase-type distributions as in [14, 22, 10].

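Next, we comment on the parameters $\theta$ and $\gamma$. A downloader may not stay in the system till it completely downloads the file. Occasionally, a downloader may leave because it feels that the download is taking too long. We assume that each downloader independently aborts its download after a time that is exponentially distributed with mean $1/\theta$. Equivalently, each downloader aborts its download and leaves the system at rate $\theta$. In a fluid model, the rate of departures of downloaders will be given by

$$\min\{c\,x(t),\ \mu(\eta x(t) + y(t))\} + \theta x(t).$$

Similarly, we assume that the time that a seed stays in the system is exponentially distributed with mean $1/\gamma$. Clearly, $\gamma$ will have an effect on system performance: the lower the $\gamma$, the lower the download times, since this means that there will be more seeds in the system. This parameter could in principle be influenced by giving peers an incentive to stay in the system after they have become seeds. However, BitTorrent does not currently have such a mechanism, and therefore, we simply consider $\gamma$ to be a fixed constant.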


Now, we are ready to describe the evolution of $x$ and $y$ based on the above assumptions. A deterministic fluid model for the evolution of the number of downloaders and seeds is given by

$$\frac{dx}{dt} = \lambda - \theta x(t) - \min\{c\,x(t),\ \mu(\eta x(t) + y(t))\},$$
$$\frac{dy}{dt} = \min\{c\,x(t),\ \mu(\eta x(t) + y(t))\} - \gamma y(t), \qquad (1)$$

along with the obvious constraint that $x(t), y(t) \ge 0$. A key contribution of [11] was to describe the efficiency of data transfer from other downloaders using the parameter $\eta$. Our fluid model provides a simple description of the system that was described by a Markov chain in [11]. In addition, we have incorporated other realistic scenarios such as departures of downloaders due to impatience with the downloading process (described by $\theta$) and the downloading bandwidth constraint $c$. In a later subsection, we will also present a simple stochastic fluid model that characterizes the variability around the fluid model. We now study the steady-state performance of the P2P system using the above fluid model.
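
To make the fluid model concrete, the following sketch integrates the two differential equations $dx/dt = \lambda - \theta x - \min\{c x, \mu(\eta x + y)\}$ and $dy/dt = \min\{c x, \mu(\eta x + y)\} - \gamma y$ with a forward-Euler step. The function name `simulate_fluid`, the step size and the initial condition of one seed and no downloader are assumptions made for illustration; any standard ODE solver could be used instead.

```python
def simulate_fluid(lam, mu, c, theta, gamma, eta,
                   x0=0.0, y0=1.0, t_end=20000.0, dt=1.0):
    """Forward-Euler integration of the deterministic fluid model (sketch).

    Returns a list of (t, x, y) samples, where x is the number of
    downloaders and y the number of seeds.
    """
    x, y, t = x0, y0, 0.0
    trajectory = [(t, x, y)]
    while t < t_end:
        # Total rate at which downloaders finish and turn into seeds.
        served = min(c * x, mu * (eta * x + y))
        dx = lam - theta * x - served
        dy = served - gamma * y
        # Keep the state in the non-negative orthant.
        x = max(x + dt * dx, 0.0)
        y = max(y + dt * dy, 0.0)
        t += dt
        trajectory.append((t, x, y))
    return trajectory
```

With $\mu = 0.00125$, $c = 0.002$ and $\theta = \gamma = 0.001$, the trajectory settles at the point where both right-hand sides vanish.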


3.1 Steady-State Performance 


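To study the system in steady state, we let $\frac{dx}{dt} = \frac{dy}{dt} = 0$ in (1) and obtain

$$0 = \lambda - \theta\bar{x} - \min\{c\bar{x},\ \mu(\eta\bar{x} + \bar{y})\},$$
$$0 = \min\{c\bar{x},\ \mu(\eta\bar{x} + \bar{y})\} - \gamma\bar{y}, \qquad (2)$$

where $\bar{x}$ and $\bar{y}$ are the equilibrium values of $x(t)$ and $y(t)$ respectively.

We first assume that the downloading bandwidth is the constraint, i.e., $c\bar{x} \le \mu(\eta\bar{x} + \bar{y})$. Equation (2) then becomes a simple linear equation. Solving the equation, we have

$$\bar{x} = \frac{\lambda}{c\left(1 + \frac{\theta}{c}\right)}, \qquad \bar{y} = \frac{\lambda}{\gamma\left(1 + \frac{\theta}{c}\right)}. \qquad (3)$$

For this solution to be consistent with the assumption $c\bar{x} \le \mu(\eta\bar{x} + \bar{y})$, we need $\frac{1}{c} \ge \frac{1}{\eta}\left(\frac{1}{\mu} - \frac{1}{\gamma}\right)$.

Instead, if we assume that the uploading bandwidth is the constraint, i.e., $c\bar{x} \ge \mu(\eta\bar{x} + \bar{y})$, we get

$$\bar{x} = \frac{\lambda}{\nu\left(1 + \frac{\theta}{\nu}\right)}, \qquad \bar{y} = \frac{\lambda}{\gamma\left(1 + \frac{\theta}{\nu}\right)}, \qquad (4)$$

where $\frac{1}{\nu} = \frac{1}{\eta}\left(\frac{1}{\mu} - \frac{1}{\gamma}\right)$. From $c\bar{x} \ge \mu(\eta\bar{x} + \bar{y})$, we have $\frac{1}{c} \le \frac{1}{\eta}\left(\frac{1}{\mu} - \frac{1}{\gamma}\right)$.

Define

$$\frac{1}{\beta} = \max\left\{\frac{1}{c},\ \frac{1}{\eta}\left(\frac{1}{\mu} - \frac{1}{\gamma}\right)\right\}. \qquad (5)$$

Then (3) and (4) can be combined to yield

$$\bar{x} = \frac{\lambda}{\beta\left(1 + \frac{\theta}{\beta}\right)}, \qquad \bar{y} = \frac{\lambda}{\gamma\left(1 + \frac{\theta}{\beta}\right)}.$$

To calculate the average time $T$ that a peer spends downloading in steady state, we use Little's law, $\bar{x} = \lambda T$, which gives

$$T = \frac{1}{\theta + \beta}. \qquad (6)$$

Recall that $\frac{1}{\beta} = \max\left\{\frac{1}{c},\ \frac{1}{\eta}\left(\frac{1}{\mu} - \frac{1}{\gamma}\right)\right\}$. Equation (6) provides several insights into the behavior of BitTorrent: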


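• The average downloading time $T$ does not depend on $\lambda$, the request arrival rate. Hence, the BitTorrent P2P system scales very well.

• When $\eta$ increases, $T$ decreases. This is because the peers share the file more efficiently.

• When $\gamma$ increases, $T$ increases because a larger $\gamma$ means that there are fewer seeds in the system.

• Initially, when $c$ increases, $T$ decreases. However, once $c$ is large enough ($\frac{1}{c} \le \frac{1}{\eta}\left(\frac{1}{\mu} - \frac{1}{\gamma}\right)$), increasing $c$ further will not decrease $T$, because the downloading bandwidth is no longer the bottleneck.

• It is often true that the downloading bandwidth $c$ of a peer would be much higher than its uploading bandwidth $\mu$. Common examples of such an asymmetry are DSL and cable modem connections. For performance analysis purposes, the downloading bandwidth constraint can then be ignored, as in [11, 24].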


We briefly comment on the case $\eta = 0$, which means that downloaders do not upload data to each other and only download from seeds. If $\gamma < \mu$, the previous analysis as in the case of $\eta > 0$ still holds with $\frac{1}{\beta} = \frac{1}{c}$. On the other hand, if $\gamma > \mu$, from (1), we can see

$$\frac{dy(t)}{dt} \le (\mu - \gamma)\,y(t).$$

Hence, $y(t)$ decreases at least exponentially fast, and the number of seeds goes to zero. So, it is very important for the downloaders to upload data to each other.

From (6), we also see that $T$ depends on $\eta$. In the next subsection, we will derive an expression for $\eta$ and argue that $\eta$ is very close to 1 in BitTorrent.
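
The steady-state quantities can be evaluated directly from the balance equations (2). The sketch below assumes $\eta > 0$ and uses the combined rate $\beta$ with $1/\beta = \max\{1/c, (1/\eta)(1/\mu - 1/\gamma)\}$; the function name `steady_state` is a hypothetical helper introduced here for illustration.

```python
def steady_state(lam, mu, c, theta, gamma, eta):
    """Equilibrium of the fluid model, assuming eta > 0 (illustrative sketch).

    Returns (x_bar, y_bar, T): equilibrium downloaders, equilibrium seeds,
    and the average time a peer spends downloading.
    """
    # The larger of the two terms identifies the bottleneck
    # (downloading bandwidth vs. uploading bandwidth).
    inv_beta = max(1.0 / c, (1.0 / eta) * (1.0 / mu - 1.0 / gamma))
    beta = 1.0 / inv_beta
    x_bar = lam / (theta + beta)
    # In equilibrium, seeds are created at rate beta * x_bar and leave at rate gamma * y_bar.
    y_bar = beta * x_bar / gamma
    T = 1.0 / (theta + beta)
    return x_bar, y_bar, T
```

Calling the function with two different values of `lam` and otherwise identical parameters returns the same `T`, which is the scalability property noted above.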


3.2 Effectiveness of File Sharing 


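In this section, we present a simple model to calculate the value of $\eta$, which indicates the effectiveness of the file sharing. For a given downloader $i$, we assume that it is connected to $k$ other downloaders. If downloader $i$ has at least one piece that is needed by the downloaders connected to it, then peer $i$ will upload data. We then have

$$\eta = 1 - P\{\text{downloader } i \text{ has no piece that the connected peers need}\}.$$

Treating the $k$ connected downloaders as independent and identical, this becomes

$$\eta = 1 - P\{\text{downloader } j \text{ needs no piece from downloader } i\}^k,$$

where $j$ is a downloader connected to $i$.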
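For each downloader, we assume that the number of pieces it has is uniformly distributed in $\{0, \ldots, N-1\}$, where $N$ is the number of pieces of the served file. Let $n_i$ denote the number of pieces at downloader $i$. We assume that, given $n_i$, these pieces are chosen randomly from the set of all pieces of the served file. This is a reasonable assumption because BitTorrent takes a rarest first piece selection policy when downloading. Under these assumptions, we have

$$P\{\text{downloader } j \text{ needs no piece from downloader } i\} = P\{j \text{ has all pieces of downloader } i\}$$
$$= \frac{1}{N^2} \sum_{n_j=1}^{N-1} \sum_{n_i=0}^{n_j} P\{j \text{ has all pieces of } i \mid n_i, n_j\}$$
$$= \frac{1}{N^2} \sum_{n_j=1}^{N-1} \sum_{n_i=0}^{n_j} \frac{\binom{N-n_i}{n_j-n_i}}{\binom{N}{n_j}} = \frac{N+1}{N^2} \sum_{n_j=1}^{N-1} \frac{1}{N+1-n_j} \approx \frac{\log N}{N},$$

and

$$\eta \approx 1 - \left(\frac{\log N}{N}\right)^k. \qquad (7)$$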

Now, we will interpret the expression for realistic file sizes. In BitTorrent, each piece is typically 256 KB. For a file that is a few hundred megabytes in size, $N$ is of the order of several hundred or more. Hence, even if $k = 1$, $\eta$ is very close to one. This tells us that BitTorrent is very efficient in sharing files.

Note that, since $k$ depends on the number of other peers in the system, it may be related to $\lambda$. However, $\eta$ is very close to one for any $k \ge 1$. Thus, our observation in the previous subsection that the network performance is essentially independent of $\lambda$ still holds. This also matches the observations of real BitTorrent networks presented in [11, 24]. Note that when $k = 0$, the downloader is not connected to any other downloaders and hence $\eta = 0$.
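
The value of $\eta$ can also be checked numerically. The sketch below evaluates the double sum over piece counts exactly, under the assumptions stated in this subsection (piece counts uniform on $\{0, \ldots, N-1\}$, pieces chosen uniformly at random, and $k$ independent neighbours). The function names `prob_needs_nothing` and `eta` are hypothetical.

```python
from math import comb


def prob_needs_nothing(n_pieces):
    """P{j has all pieces of i}, with piece counts uniform on {0, ..., N-1}."""
    N = n_pieces
    total = 0.0
    for nj in range(1, N):
        for ni in range(0, nj + 1):
            # The nj random pieces held by j must contain all ni pieces held by i.
            total += comb(N - ni, nj - ni) / comb(N, nj)
    return total / N**2


def eta(n_pieces, k):
    """Effectiveness of file sharing for a downloader with k neighbours."""
    return 1.0 - prob_needs_nothing(n_pieces) ** k
```

For example, a file of 100 MB split into 256 KB pieces has $N = 400$, and `eta(400, 1)` is already above 0.98, while `eta(400, 0)` is 0 as noted in the text.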


3.3 Local Stability 


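When deriving the steady-state quantities $\bar{x}$, $\bar{y}$ and $T$, we implicitly assumed that the system is stable and will reach its equilibrium. In this section, we study the stability of the fluid model (1) around the equilibrium $\{\bar{x}, \bar{y}\}$.

When $\frac{1}{c} < \frac{1}{\eta}\left(\frac{1}{\mu} - \frac{1}{\gamma}\right)$, the uploading bandwidth is the constraint in a small neighborhood of the equilibrium, where the fluid model becomes

$$\frac{dx(t)}{dt} = \lambda - \theta x(t) - \mu(\eta x(t) + y(t)),$$
$$\frac{dy(t)}{dt} = \mu(\eta x(t) + y(t)) - \gamma y(t).$$

Let

$$A_1 = \begin{bmatrix} -(\mu\eta + \theta) & -\mu \\ \mu\eta & -(\gamma - \mu) \end{bmatrix}. \qquad (8)$$

Then the eigenvalues of $A_1$ satisfy

$$\psi^2 + (\mu\eta + \theta + \gamma - \mu)\psi + \mu\eta\gamma + \theta(\gamma - \mu) = 0. \qquad (9)$$

Since $\frac{1}{c} < \frac{1}{\eta}\left(\frac{1}{\mu} - \frac{1}{\gamma}\right)$, we have $\gamma > \mu$. When $\eta > 0$, both $\mu\eta + \theta + \gamma - \mu$ and $\mu\eta\gamma + \theta(\gamma - \mu)$ are greater than zero, so both eigenvalues of $A_1$ have strictly negative real parts and the system is locally stable.

When $\frac{1}{c} > \frac{1}{\eta}\left(\frac{1}{\mu} - \frac{1}{\gamma}\right)$, the downloading bandwidth is the constraint around the equilibrium and

$$\frac{dx(t)}{dt} = \lambda - \theta x(t) - c\,x(t),$$
$$\frac{dy(t)}{dt} = c\,x(t) - \gamma y(t).$$

Let

$$A_2 = \begin{bmatrix} -(\theta + c) & 0 \\ c & -\gamma \end{bmatrix}. \qquad (10)$$

Then the eigenvalues of $A_2$ satisfy

$$\psi^2 + (\theta + \gamma + c)\psi + (\theta + c)\gamma = 0. \qquad (11)$$

Again, since both $\theta + \gamma + c$ and $(\theta + c)\gamma$ are greater than zero, the system is locally stable.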


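The case $\frac{1}{c} = \frac{1}{\eta}\left(\frac{1}{\mu} - \frac{1}{\gamma}\right)$ is a little more tricky, since the equilibrium then lies on the boundary between the two regions in which the linear dynamics are governed by $A_1$ and $A_2$. To avoid lengthy arguments, we do not treat this case here.

Even in the cases where $\frac{1}{c} \ne \frac{1}{\eta}\left(\frac{1}{\mu} - \frac{1}{\gamma}\right)$, the global stability of the fluid model (1) may be hard to analyze because of the fact that the dynamics switch between two different linear systems depending on the state. Such systems are called switched linear systems; we refer the reader to the survey in [16] for the stability issues associated with such models.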


3.4 Characterizing Variability 


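The fluid model studied so far is deterministic. However, it is important to understand how the number of seeds and downloaders vary around the numbers predicted by the deterministic model. In this subsection, we present a simple characterization of the variance of $x$ and $y$ around $\bar{x}$ and $\bar{y}$ using a Gaussian approximation.

Under the assumptions that we have discussed in Section 3, when the arrival rate $\lambda$ is large, the fluctuations of the number of downloaders and seeds around the equilibrium, normalized by $\sqrt{\lambda}$ and denoted by $\hat{x}$ and $\hat{y}$, can be approximated by a linear stochastic differential equation whose solution is known as the Ornstein-Uhlenbeck process:

$$dX(t) = A X(t)\,dt + B\,dW(t). \qquad (12)$$

In (12), the components of $W$ are independent standard Wiener processes (Brownian motions) and $X(t) = (\hat{x}(t), \hat{y}(t))^T$, with $A = A_1$ given by (8) if $\frac{1}{c} < \frac{1}{\eta}\left(\frac{1}{\mu} - \frac{1}{\gamma}\right)$ and $A = A_2$ given by (10) if $\frac{1}{c} > \frac{1}{\eta}\left(\frac{1}{\mu} - \frac{1}{\gamma}\right)$. In both cases, we have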


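$$B = \begin{bmatrix} 1 & -\sqrt{\rho} & -\sqrt{1-\rho} & 0 \\ 0 & 0 & \sqrt{1-\rho} & -\sqrt{1-\rho} \end{bmatrix}, \qquad (13)$$

where $\rho := \frac{\theta}{\theta + \beta}$, which is the fraction of downloaders that leave without completing the download.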
From (12), it is easy to compute the steady-state covariance of $X$, i.e., $\Sigma = \lim_{t\to\infty} E(X(t)X^T(t))$. This is given by the so-called Lyapunov equation [3]

$$A\Sigma + \Sigma A^T + BB^T = 0. \qquad (14)$$

The steady-state variance of $\hat{x}$ is then given by the (1,1) element of $\Sigma$ and the steady-state variance of $\hat{y}$ is given by the (2,2) element of $\Sigma$. The above result essentially states that, in steady state, the number of downloaders (resp. seeds) is approximately Gaussian with mean $\bar{x}$ (resp. $\bar{y}$) and variance $\lambda$ times the (1,1) (resp. (2,2)) element of $\Sigma$.

The formal proof required to establish (12) is beyond the scope of this paper. We will simply state here that it involves showing that the original stochastic process converges to the deterministic and stochastic differential equation limits when the arrival rate goes to $\infty$. This can be established using weak-convergence theorems such as the ones in [5, 12, 23].
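
For a two-dimensional system, the Lyapunov equation $A\Sigma + \Sigma A^T + Q = 0$ with $Q = BB^T$ reduces to three linear equations in the entries of the symmetric matrix $\Sigma$. The sketch below solves them by Gaussian elimination. The helper name `solve_lyapunov_2x2` is hypothetical, and the matrix `Q` used in the example is an assumption: it is obtained by treating arrivals, aborts, download completions and seed departures as independent Poisson streams normalized by $\lambda$ in the download-constrained case.

```python
def solve_lyapunov_2x2(A, Q):
    """Solve A S + S A^T + Q = 0 for a symmetric 2x2 matrix S (sketch).

    A and Q are nested lists; Q must be symmetric and A must be stable.
    """
    (a11, a12), (a21, a22) = A
    (q11, q12), (_, q22) = Q
    # Unknowns s11, s12, s22: one equation per distinct entry of the
    # matrix equation, stored as an augmented 3x4 system.
    M = [
        [2 * a11, 2 * a12, 0.0, -q11],
        [a21, a11 + a22, a12, -q12],
        [0.0, 2 * a21, 2 * a22, -q22],
    ]
    n = 3
    for col in range(n):
        # Gauss-Jordan elimination with partial pivoting.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [u - f * v for u, v in zip(M[r], M[col])]
    s11, s12, s22 = (M[i][3] / M[i][i] for i in range(n))
    return [[s11, s12], [s12, s22]]


# Example: download-constrained case with theta = gamma = 0.001, c = 0.002.
theta, gamma, c = 0.001, 0.001, 0.002
rho = theta / (theta + c)
A2 = [[-(theta + c), 0.0], [c, -gamma]]
Q = [[2.0, -(1.0 - rho)], [-(1.0 - rho), 2.0 * (1.0 - rho)]]  # assumed B B^T
Sigma = solve_lyapunov_2x2(A2, Q)
```

Under these assumptions the normalized variances equal the normalized means, $1/(\theta + c)$ for downloaders and $c/(\gamma(\theta + c))$ for seeds, as one would expect from a network of infinite-server queues.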


4. INCENTIVE MECHANISM 


In this section, we discuss the algorithm in BitTorrent 
which is intended to discourage free-riding. We first describe 
the algorithm and then study the optimal selfish behavior 
of the users under this algorithm. 


4.1 Peer Selection Algorithm 


There is a built-in incentive mechanism in BitTorrent to encourage users to upload. The basic idea is that each peer uploads to the $n_u$ peers from which it has the highest downloading rates (the default value of $n_u$ is 4). But since a peer only has partial information of the whole network (i.e., it doesn't have the upload rate information of all peers), optimistic unchoking [8] is used to explore the network. In this section, our objective is to understand how the built-in incentive mechanism affects the network performance. Hence, we assume that each peer has global information, i.e., it knows the uploading bandwidths of all other peers.

Under the above assumptions, we can simplify the peer selection algorithm of BitTorrent as follows. We first sort the peers according to their uploading bandwidth (it could be the physical uploading bandwidth or the uploading bandwidth that has been set manually by the user) such that the first peer has the highest uploading bandwidth. If two or more peers have the same uploading bandwidth, they are randomly ordered. The peer selection process proceeds in steps, with peer $i$ choosing peers to upload to at step $i$. In the real BitTorrent, the peer selection does not proceed in steps like this. However, after we describe the selection algorithm, it will be clear that the step-by-step selection process does not change the selection of the peers significantly. Let $N$ be the total number of peers and let $\mu_i$ be the uploading bandwidth of peer $i$. Then at step $i$, peer $i$ selects peers to upload to according to the following rules.


1. If peer $i$ is selected by peer $j$ ($j < i$), then $i$ selects $j$. For any peer $k$ ($k \ge i$), let $n_k^i$ be the number of peers that have selected peer $k$ prior to step $i$.

2. If $n_i^i < n_u$ and $n_u - n_i^i \le N - i$, peer $i$ selects $n_u - n_i^i$ peers from the set $\{k \mid k > i\}$ using the following set of rules to prioritize a peer, say $k_1$, over another peer $k_2$:

(a) If $\mu_{k_1} > \mu_{k_2}$, select $k_1$.

(b) If $\mu_{k_1} = \mu_{k_2}$ and $n_{k_1}^i < n_{k_2}^i$, select $k_1$.

(c) If $\mu_{k_1} = \mu_{k_2}$, $n_{k_1}^i = n_{k_2}^i$, and $k_1 < k_2$, select $k_1$.

3. If $n_i^i < n_u$ and $n_u - n_i^i > N - i$, peer $i$ selects all peers in $\{k \mid k > i\}$ and also randomly selects $(n_u - n_i^i) - (N - i)$ peers from the peers that $i$ has not selected yet.
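
The three rules can be turned into a short program. The sketch below is one possible reading of the rules, using 0-based peer indices and assuming that the bandwidth list is already sorted in non-increasing order. The names `select_peers` and `download_rates` are hypothetical. Random picks made under rule 3 are returned separately, and the download rate of a peer is computed as the sum of $\mu_k / n_u$ over the peers $k$ that selected it under rules 1 and 2.

```python
import random


def select_peers(mu, n_u=4, rng=None):
    """Step-by-step peer selection following rules 1-3 (illustrative sketch).

    mu: uploading bandwidths sorted in non-increasing order.
    Returns (selected, random_extra), where selected[i] is the set of peers
    chosen by peer i under rules 1 and 2 (and the non-random part of rule 3)
    and random_extra[i] is the set picked at random under rule 3.
    """
    rng = rng or random.Random()
    N = len(mu)
    selected = [set() for _ in range(N)]
    random_extra = [set() for _ in range(N)]
    chosen_by = [0] * N  # chosen_by[k]: how many peers have selected k so far
    for i in range(N):
        # Rule 1: reciprocate every earlier peer that selected i.
        for j in range(i):
            if i in selected[j]:
                selected[i].add(j)
        need = n_u - chosen_by[i]
        later = list(range(i + 1, N))
        if need <= 0:
            continue
        if need <= len(later):
            # Rule 2: higher bandwidth first, then fewer selections, then lower index.
            later.sort(key=lambda k: (-mu[k], chosen_by[k], k))
            picks = later[:need]
        else:
            # Rule 3: take all later peers, then fill the remaining slots at random.
            picks = later
            pool = [k for k in range(i) if k not in selected[i]]
            extra = min(need - len(later), len(pool))
            random_extra[i] = set(rng.sample(pool, extra))
        for k in picks:
            selected[i].add(k)
            chosen_by[k] += 1
    return selected, random_extra


def download_rates(mu, selected, n_u=4):
    """Rate received by each peer from the peers that selected it (random picks excluded)."""
    N = len(mu)
    return [sum(mu[k] for k in range(N) if i in selected[k]) / n_u for i in range(N)]
```

With six peers of distinct bandwidths and $n_u = 4$, the five fastest peers select each other and the slowest peer receives nothing from rules 1 and 2.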


These rules are easy to understand. We will show in Lemma 1 that $n_i^i \le n_u$. So rule 1 will not violate the requirement that the number of uploads cannot exceed $n_u$. The following lemma is a simple property of the peer selection algorithm.

LEMMA 1. For any peer $i$, $n_i^i \le n_u$, and for any $k_2 > k_1 \ge i$, $n_{k_2}^i \le n_{k_1}^i \le n_u$.

Proof: First, when $i = 1$, $n_1^1 = 0 \le n_u$ and $n_{k_1}^1 = n_{k_2}^1 = 0 \le n_u$, so the lemma is true. Now, we assume that the lemma is true for peer $i$ and prove that it is also true for peer $i+1$; hence, by induction, it is true for all $i$.

If the lemma is true for $i$, we will have $n_{i+1}^i \le n_i^i \le n_u$ and $n_{k_2}^i \le n_{k_1}^i \le n_u$ for any $k_2 > k_1 \ge i+1$. Now, if $n_i^i = n_u$, then peer $i$ already has $n_u$ uploads and it will not select any peer from $\{k \mid k > i\}$. Hence, for any $k > i$, $n_k^{i+1} = n_k^i$ and the lemma is true for $i+1$. If $n_i^i < n_u$, then $n_{i+1}^i < n_u$. So, no matter whether peer $i$ selects peer $i+1$ or not, we always have $n_{i+1}^{i+1} \le n_u$. To show the second part of the lemma, if $n_{k_1}^i > n_{k_2}^i$, after peer $i$ makes the selection, we always have $n_{k_1}^{i+1} \ge n_{k_2}^{i+1}$. If $n_{k_1}^i = n_{k_2}^i$, according to rule 2, we also have $n_{k_1}^{i+1} \ge n_{k_2}^{i+1}$. Hence the lemma is true for $i+1$. ∎

Now let $D_i$ be the set of peers that select peer $i$. We exclude peers that randomly select $i$ by using rule 3 here for two reasons. First, each peer $i$ has about an equal chance of being selected and hence, on average, the effect of the random selection can be equivalently seen as each peer getting a constant download rate $d_r$. Secondly, if the number of peers is large, $d_r$ will be very small and can be ignored. The aggregate downloading rate of peer $i$ then is

$$d_i = \sum_{k \in D_i} \frac{\mu_k}{n_u}.$$

Note that if two peers have the same uploading bandwidth, they may get different downloading rates. Generally, if $\mu_i = \mu_{i+1} = \cdots = \mu_j$ are peers with the same uploading bandwidth, we will have $d_i \ge d_{i+1} \ge \cdots \ge d_j$. So, for a given peer $i$, the downloading rate not only depends on the uploading bandwidth $\mu_i$, but also depends on how the peer is ordered with regard to other peers with the same uploading bandwidth. To eliminate the ambiguity, when there are two or more peers with the same uploading bandwidth $\mu$, we define the downloading rate of these peers to be

$$d(\mu) = \frac{1}{j-i+1} \sum_{k=i}^{j} d_k, \qquad (15)$$

where $i$ (resp. $j$) is the first (resp. last) peer with uploading bandwidth $\mu$. Moreover, we have the following lemma when $n_u \ge 2$.


LEMMA 2. Let $i$ (resp. $j$) be the first (resp. last) peer with uploading bandwidth $\mu$, with $j - i \ge n_u$, and let $k > j$. Then

1. $d_i \ge d_{i+1} \ge \cdots \ge d_j \ge d_k$,
2. $d_i > d_k$,
3. $d(\mu) > d_k$.

Proof: First, from the peer selection rules, it is easy to see that for any two peers $k_1 < k_2$, $d_{k_1} \ge d_{k_2}$. So condition 1 is obviously true. Now, to prove condition 2, we only need to prove $d_i > d_{j+1}$. When peer $i$ selects peers, if $n_i^i = n_u$ (i.e., peer $i$ has already been selected by $n_u$ peers that have uploading bandwidth greater than $\mu$), then $d_i > \mu$. If $n_i^i < n_u$, peer $i$ will select $n_u - n_i^i$ peers from $i+1, \ldots, j$. So, we always have $d_i \ge \mu$.

Now, when peer $m$ ($i \le m \le j - n_u$) selects peers, if $n_{m+1}^m \ge 1$, obviously we will have $n_{m+1}^{m+1} \ge 1$. If $n_{m+1}^m = 0$, since peers $m$ and $m+1$ have the same uploading bandwidth, from peer selection rule 2(b), we have $n_m^m \le 1 < n_u$ and $m$ will select peer $m+1$. Hence $n_{m+1}^{m+1} = 1$. In both cases, we have $n_{m+1}^{m+1} \ge 1$.

When $m+1$ selects peers, since $n_{m+1}^{m+1} \ge 1$, we have $n_u - n_{m+1}^{m+1} \le n_u - 1$. From $m+2$ to $j$, we have at least $n_u - 1$ peers with uploading bandwidth $\mu$. So, $m+1$ will not select peer $j+1$, and peer $j+1$ can at most be selected by $n_u - 1$ peers with uploading bandwidth $\mu$. So

$$d_{j+1} \le \frac{1}{n_u}\left((n_u - 1)\mu + \mu_{j+2}\right) < \mu \le d_i.$$

Since $d_{j+1} \ge d_k$ for any $k > j+1$, we have $d_i > d_k$.

From condition 2 and the definition of $d(\mu)$ in (15), it is easy to see that $d(\mu) > d_k$ and we are done. ∎

Now, we have defined the peer selection rules. We will next study how these rules affect the peer strategy.

4.2 Peer Strategy


The objective of the incentive mechanism is to encourage users to contribute. In BitTorrent, the uploading bandwidth can be chosen by each user up to a maximum of the physical uploading bandwidth. The purpose of the rest of this section is to study how the incentive mechanism will affect the peer strategy, i.e., how the users set their bandwidth. Let $p_i$ be the physical uploading bandwidth of peer $i$ and let $\{\mu_{-i}\}$ be the set of uploading bandwidths chosen by the peers other than $i$. Let $d_i(\mu_i, \mu_{-i})$ be the aggregate downloading rate of peer $i$ when the uploading bandwidth of peer $i$ is $\mu_i$. When $\{\mu_{-i}\}$ is given, it is obvious that $d_i$ is a non-decreasing function of $\mu_i$.

Intuitively, we may expect peer $i$ to choose the smallest uploading bandwidth that still gives it the maximum downloading rate, i.e.,

$$\mu_i = \min\{\tilde{\mu}_i \mid d_i(\tilde{\mu}_i, \mu_{-i}) = d_i(p_i, \mu_{-i})\}. \qquad (16)$$

But unfortunately, the minimum of the set $\{\tilde{\mu}_i \mid d_i(\tilde{\mu}_i, \mu_{-i}) = d_i(p_i, \mu_{-i})\}$ may not exist (e.g., for the set $(4, 6]$). If we take this into account, the strategy becomes

$$\mu_i = \min\{\inf\{\tilde{\mu}_i \mid d_i(\tilde{\mu}_i, \mu_{-i}) = d_i(p_i, \mu_{-i})\} + \epsilon,\ p_i\}, \qquad (17)$$

where $\epsilon > 0$ is a small number. Note that even if the minimum of $\{\tilde{\mu}_i \mid d_i(\tilde{\mu}_i, \mu_{-i}) = d_i(p_i, \mu_{-i})\}$ exists, it is still better to add a small number $\epsilon$, because if the uploading bandwidths of two peers are very close, we may not be able to detect the difference between them. Hence, adding a small positive number can help differentiate peer $i$ from other competing peers.


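Given the peer selection algorithm (game rules) and the peer strategy (17), a Nash equilibrium point is a set of uploading bandwidths $\{\bar{\mu}_i\}$ such that, for every peer $i$,

$$\bar{\mu}_i = \min\{\inf\{\tilde{\mu}_i \mid d_i(\tilde{\mu}_i, \bar{\mu}_{-i}) = d_i(p_i, \bar{\mu}_{-i})\} + \epsilon,\ p_i\}.$$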

Let's consider a small BitTorrent network with 6 peers. The number of uploads is $n_u = 4$ for all peers. We will show that if the peers have different physical uploading bandwidths and the minimum uploading bandwidth $\min\{p_i\} > 2\epsilon$, there is no Nash equilibrium point for the system. In this simple example, we can see that if the uploading bandwidth $\mu_i$ of peer $i$ is less than those of all other peers, then peer $i$ will get zero downloading rate because the other five peers will upload to each other and not to peer $i$. On the other hand, once $\mu_i$ is greater than the uploading bandwidth of at least one peer, peer $i$ will get the same downloading rate even if $\mu_i < p_i$. So the strategy (17) for peer $i$ in this example turns out to be setting $\mu_i$ such that it is the fifth highest uploading bandwidth. Now, assume that there is a Nash equilibrium point $\{\bar{\mu}_i\}$ and we sort the peers by their uploading bandwidth such that $\bar{\mu}_1$ is the highest uploading bandwidth. Then we have $\bar{\mu}_5 > \bar{\mu}_6$. Otherwise, if they are equal, since the two peers have different physical uploading bandwidths, there is at least one peer with $\bar{\mu}_i < p_i$ and this peer can increase its uploading bandwidth to increase its download rate. Now, if $\bar{\mu}_5 > \bar{\mu}_6$, we know that peer 6 gets a zero downloading rate. Since $\{\bar{\mu}_i\}$ is a Nash equilibrium, given $\{\bar{\mu}_{-6}\}$, the maximum downloading rate that peer 6 can get is also zero. Hence, from (17), we have $\bar{\mu}_6 = \epsilon$. Now, if $\bar{\mu}_6 = \epsilon$, from (17), we have $\bar{\mu}_5 = 2\epsilon < \min\{p_i\}$. If $\bar{\mu}_5 < \min\{p_i\}$, peer 6 can increase its uploading bandwidth such that $\mu_6 > \bar{\mu}_5$, which contradicts the fact that $\bar{\mu}_5$ is the fifth highest uploading bandwidth. Hence, there is no Nash equilibrium point for the system. While there may be no Nash equilibrium point in general, one does exist under certain conditions, as we show in the next subsection.


4.3 Nash Equilibrium Point 


We consider a network with a finite number of groups of peers. In group $j$, all peers have the same physical uploading bandwidth $p_j$. Note that this is in fact a good model for peers in the Internet, who have only a finite number of network access methods (dial-up, DSL, cable modem, etc.). Let $g_j$ be the set of peers in group $j$ and $\|g_j\|$ be the number of peers in group $j$. Without loss of generality, we also assume $p_1 > p_2 > \cdots$.


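PROPOSITION 1. If $\|g_j\| > n_u + 1$ for all groups $j$, then $\{\bar{\mu}_i = p_i\}$ is a Nash equilibrium point of the system. Moreover, starting from any initial set of uploading bandwidths $\{\mu_i^0\}$, the system converges to this Nash equilibrium point when the peers repeatedly update their uploading bandwidths according to (17).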


Proof: We first prove that $\{\bar{\mu}_i\}$ is a Nash equilibrium point. To prove this, we only need to prove that for any peer $i$, if $\mu_i < \bar{\mu}_i$, then $d_i(\mu_i, \bar{\mu}_{-i}) < d_i(\bar{\mu}_i, \bar{\mu}_{-i})$. Without loss of generality, we assume that $i \in g_j$. Since $\|g_j\| > n_u + 1$, if we set $\mu_i < \bar{\mu}_i = p_j$, there will be still at least $n_u + 1$ peers with uploading bandwidth $p_j$. From Lemma 2, it is easy to see that $d_i(\mu_i, \bar{\mu}_{-i}) < d_i(\bar{\mu}_i, \bar{\mu}_{-i})$.

To prove convergence, we first consider the first group $g_1$. Let $v^m$ be the $(n_u + 1)$th highest uploading bandwidth after $m$ rounds of iterations. Then $v^0$ is the $(n_u + 1)$th highest uploading bandwidth of the initial set $\{\mu_i^0\}$. If after $m$ rounds, $v^m + \epsilon < p_1$, then in round $m + 1$, any peer $i \in g_1$ will increase its uploading bandwidth to $\mu_i \ge v^m + \epsilon$ to maximize its downloading rate. Since $\|g_1\| > n_u + 1$, after round $m + 1$, we will have $v^{m+1} \ge v^m + \epsilon$. The increase in $v^m$ will continue until $v^m = p_1$ and the peers cannot increase their uploading bandwidth anymore. In this case, any peer $i \in g_1$ will have the uploading bandwidth $\mu_i = p_1$. Once peers in the first group reach their maximum limit, they will not change their uploading bandwidth anymore. We can now use a similar argument to prove that peers in the second group will also reach the Nash equilibrium point. Continuing in a similar fashion, we can establish that the whole system converges to the Nash equilibrium point. ∎


5. OPTIMISTIC UNCHOKING 


In Section 4, we assume that each peer knows the up- 
loading bandwidths of all other peers. In reality, each peer 
only has the rate information about peers from which it is 
downloading. Hence optimistic unchoking is used to explore 
the network and obtain information about other peers. In 
this section, we briefly study the effect of optimistic un- 
choking on free-riders. Specifically, while in Section 4.3, we 
showed that rational users would set their uploading rate to 
be equal to the maximum possible limit, here we will show 


that the maximum downloading rate that an irrational user 


5.1 Free-Riding 


Free-riding means that a peer does not contribute anything to the system, while it attempts to obtain service (or downloading) from other peers. If peers have global information, the free-riding problem can be solved by not uploading to peers with zero uploading bandwidth. In reality, peers do not have global information and use optimistic unchoking to explore the network, which gives free-riders an opportunity to download. To illustrate it, let's consider a simple example.

We consider a network with a group of peers ($g_1$) that have the same uploading bandwidth $\mu$. The number of peers in the group is $N$. We assume that each peer has $n_u$ uploads and one optimistic unchoking upload. Now, a new peer $j$ with zero uploading bandwidth joins the network. Each peer $i \in g_1$ will randomly choose a peer that it is not currently uploading to as the target of its optimistic unchoking. So, the expected total downloading rate of the free-rider $j$ is

$$N \cdot \frac{1}{N - n_u} \cdot \frac{\mu}{n_u + 1} \approx \frac{\mu}{n_u + 1}.$$

In this example, we see that because of optimistic unchoking, a free-rider obtains a downloading rate of about $\frac{\mu}{n_u + 1}$, i.e., $\frac{\mu}{5}$ for the default $n_u = 4$, without uploading anything. The choice of an optimal $n_u$ or other methods to alleviate the free-riding problem is a subject for further study.
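
The estimate in this example is easy to evaluate for finite $N$. In the sketch below, each of the $N$ peers picks the free-rider with probability $1/(N - n_u)$ and, when it does, gives it one of its $n_u + 1$ equal upload shares. The function name `free_rider_rate` is hypothetical.

```python
def free_rider_rate(N, n_u, mu):
    """Expected download rate of a peer that uploads nothing (illustrative sketch)."""
    p_chosen = 1.0 / (N - n_u)   # chance that a given peer optimistically unchokes the free-rider
    per_upload = mu / (n_u + 1)  # bandwidth of one of the n_u + 1 upload slots
    return N * p_chosen * per_upload
```

For large $N$ the value approaches $\mu / (n_u + 1)$, independently of the group size.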


6. EXPERIMENTAL RESULTS 


We performed a series of experiments to validate the fluid 
model described in Section 3. In the first two experiments, 
we compare a simulated BitTorrent-like network and the 
fluid model. In the last experiment, we actually introduced 
a seed into the BitTorrent network, studied the evolution of 
the seeds/downloaders, and compared it to our fluid model 
results. Due to copyright reasons, we obviously could not 
introduce a very popular file into the network. However, 
as we will show in our experimental results, even for a file 
which had a total of less than 100 completed downloads, 
the match between the fluid model and the observed data is 
quite close. 


6.1 Experiment 1 

Figure 1: Experiment 1: The evolution of the number of seeds as a function of time

Figure 2: Experiment 1: The evolution of the number of downloaders as a function of time


In Figs. 1 and 2, we compare the simple deterministic fluid model that we derived with the results from a discrete-event simulation of a BitTorrent-like network. In the discrete-event simulation, we use the Markov model described in Section 3.4. We chose the following parameters for this simulation: $\mu = 0.00125$, $c = 0.002$, $\theta = \gamma = 0.001$. When the number of downloaders is 1, we set $\eta = 0$; otherwise, we set $\eta = 1$. This is in keeping with our observation regarding the efficiency of the download as described in Section 3.2. Initially, there is one seed and no downloader. We also keep the number of seeds no less than one during the entire simulation. We change the arrival rate $\lambda$ from 0.04 to 40 and plot the number of seeds/downloaders normalized by the arrival rate, i.e., $\frac{x(t)}{\lambda}$ and $\frac{y(t)}{\lambda}$, from both simulations and the fluid model. In this experiment, since $\gamma < \mu$, we know that downloading bandwidth is the bottleneck. From the figures


ene C hich shows 
that the system scales very well. In other words, 


6.2 Experiment 2 

Figure 3: Experiment 2: The evolution of the number of seeds 


In Figs. 3 and 4, we have the same setting as the first 
experiment, except that now we set $\gamma = 0.005$. With the 
[...] Again, we see that the simple fluid model is close to 
the simulation results. 

Figure 4: Experiment 2: The evolution of the number of downloaders 

Figure 5: Experiment 2: Histogram of the variation of the number of seeds around the fluid model 

We also plot the histograms of $\hat{x}$ and $\hat{y}$ in 
Figs. 5 and 6, where 

$$\hat{x}(t) = \frac{x_{sim}(t) - x(t)}{\sqrt{\lambda}}$$

and 

$$\hat{y}(t) = \frac{y_{sim}(t) - y(t)}{\sqrt{\lambda}},$$

where $x_{sim}(t)$ and $y_{sim}(t)$ are the number of downloaders 
and seeds, respectively, in the actual simulation, and $x(t)$ and 
$y(t)$ are the number of downloaders and seeds in the 
deterministic fluid model. From the theory presented in 
Section 3.4, we expect the histograms to look roughly Gaussian, 
and this is borne out by the figures for sufficiently large 
$\lambda$. We can see that the variances of $\hat{x}$ and 
$\hat{y}$ do not change much when $\lambda$ changes from 0.04 
to 40. 
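
The normalized deviations plotted in the histograms can be computed along the following lines. This is a minimal sketch: the function names `normalized_deviation` and `histogram` and the fixed-width binning rule are illustrative assumptions, not the procedure used to produce Figs. 5 and 6.

```python
import math


def normalized_deviation(sim, fluid, lam):
    """Return (sim[i] - fluid[i]) / sqrt(lam) for each time sample."""
    if len(sim) != len(fluid):
        raise ValueError("sim and fluid must have the same length")
    scale = math.sqrt(lam)
    return [(s - f) / scale for s, f in zip(sim, fluid)]


def histogram(values, bin_width):
    """Count values per bin; bin k covers [k*bin_width, (k+1)*bin_width)."""
    counts = {}
    for v in values:
        k = math.floor(v / bin_width)
        counts[k] = counts.get(k, 0) + 1
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    # Toy numbers for illustration only, not data from the experiments.
    x_hat = normalized_deviation([14, 12, 13], [13, 13, 13], lam=4.0)
    print(x_hat, histogram(x_hat, bin_width=0.5))
```

Dividing by $\sqrt{\lambda}$ is what makes histograms for very different arrival rates comparable on a single scale.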


Figure 6: Experiment 2: Histogram of the variation of the number of downloaders around the fluid model 


6.3 Experiment 3 


In this experiment, we introduced a file into the BitTorrent 
network and collected the log files of the BitTorrent tracker 
for a time period of around three days. When a peer joins/leaves 
the system or completes the download, it reports the event to 
the tracker. In addition, peers regularly report information 
such as the total amount of data uploaded/downloaded so far, 
the number of bytes that still need to be downloaded, etc. The 
tracker keeps all the information in the log files. Hence, we 
can analyze the tracker log files and retrieve useful 
information. The parameters $\lambda$, $\theta$, and $\gamma$ 
can be measured by counting the peer arrivals, the downloader 
departures, and the seed departures, respectively. However, 
from the tracker log files, we cannot determine [...] (we 
assume that $\eta = 1$). The size of the file that was 
introduced was around 530 MB. The average uploading bandwidth 
was estimated to be 90 kb/s. We use 1 min as the time unit to 
calculate arrival rates, departure rates, etc. The normalized 
uploading bandwidth (normalized by the file size in bytes) was 
estimated to be $\mu = 0.0013$. The downloader leaving rate was 
estimated to be $\theta = 0.001$. [...] i.e., there will be no 
one to download from. 
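
The normalization of the uploading bandwidth quoted above can be checked with a short calculation. The sketch below assumes that "kb/s" means kilobits per second and that decimal units apply (1 kb = 1000 bits, 1 MB = 10^6 bytes); these unit conventions and the function name `normalized_upload_rate` are assumptions made for this example.

```python
def normalized_upload_rate(rate_kbit_per_s, file_size_mb, seconds_per_unit=60):
    """Upload rate expressed in files per time unit (default: per minute).

    Assumes decimal units: 1 kbit = 1000 bits, 1 MB = 1_000_000 bytes.
    """
    bytes_per_unit = rate_kbit_per_s * 1000 / 8 * seconds_per_unit
    file_bytes = file_size_mb * 1_000_000
    return bytes_per_unit / file_bytes


if __name__ == "__main__":
    mu = normalized_upload_rate(90, 530)
    print(round(mu, 4))  # prints 0.0013
```

With these conventions, 90 kb/s corresponds to 675,000 bytes per minute, which is about 0.0013 of a 530 MB file per minute.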

From the tracker logs, we estimate that, for $t < 800$ min, 
$\lambda = 0.06$ and $\gamma = 0.001$. When $t > 1300$ min, 
$\lambda = 0.03$ and $\gamma = 0.0044$. In our fluid model 
simulation, for time between 800 min and 1300 min, we let 
$\lambda$ and $\gamma$ change linearly. We 
also set the downloading bandwidth c = 1 for the fluid model 
simulation (note that the actual value of c will not affect the 
fluid model results if it is above a certain threshold). 
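
The time-varying parameters described above can be written as a simple linear ramp between the two measured regimes. This is a minimal sketch; the helper names `ramp`, `arrival_rate`, and `seed_leaving_rate` are hypothetical and chosen only for this illustration.

```python
def ramp(t, t0, t1, v0, v1):
    """Value v0 before t0, v1 after t1, linear interpolation in between."""
    if t <= t0:
        return v0
    if t >= t1:
        return v1
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def arrival_rate(t):
    """lambda(t) in peers per minute, t in minutes."""
    return ramp(t, 800.0, 1300.0, 0.06, 0.03)


def seed_leaving_rate(t):
    """gamma(t) per minute, t in minutes."""
    return ramp(t, 800.0, 1300.0, 0.001, 0.0044)


if __name__ == "__main__":
    for t in (0, 800, 1050, 1300, 2000):
        print(t, arrival_rate(t), seed_leaving_rate(t))
```

Functions of this kind can be evaluated at each time step of a numerical integration of the fluid model in place of constant parameters.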

The simulation results are shown in Figs 7 and 8. The 
real trace is measured from the tracker log file and the fluid 
model is calculated by using the above measured parame- 
ters. For the fluid model, we also numerically calculate the 
standard deviation from the steady state network parame- 
ters by using (14) and plot the error bar for 95% confidence 
intervals. From Fig. 7, we see that the fluid model captures 
the evolution of the number of seeds well. In Fig. 8, the 
oscillation of the number of downloaders is more significant. 
This is because the file is not very popular and the 
arrival rate is small. Hence, our model is only an 
approximation of the real network. But despite this, we can see 
that the oscillation is within the level suggested by the 95% 
confidence interval. 
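
The comparison against the 95% confidence interval can be sketched as follows, assuming the Gaussian approximation so that the interval is the fluid-model value plus or minus 1.96 standard deviations. The standard deviations themselves would come from (14), which is not reproduced here; the function names and the toy numbers are illustrative assumptions, not measured data.

```python
Z_95 = 1.96  # two-sided 95% quantile of the standard normal distribution


def confidence_band(fluid_value, std_dev, z=Z_95):
    """Return (lower, upper) bounds around the fluid-model value."""
    half_width = z * std_dev
    return fluid_value - half_width, fluid_value + half_width


def fraction_inside(trace, fluid, std_devs):
    """Fraction of observed samples that fall inside the band."""
    inside = 0
    for obs, mean, sd in zip(trace, fluid, std_devs):
        lower, upper = confidence_band(mean, sd)
        if lower <= obs <= upper:
            inside += 1
    return inside / len(trace)


if __name__ == "__main__":
    # Toy values for illustration only.
    print(confidence_band(10.0, 2.0))
    print(fraction_inside([10, 20], [10, 10], [2, 2]))
```

A fraction close to 0.95 or higher would be consistent with oscillations that stay within the level suggested by the interval.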


Figure 7: Experiment 3: Evolution of the number of seeds 

Figure 8: Experiment 3: Evolution of the number of downloaders 


7. CONCLUSIONS 


In this paper, we first presented a simple fluid model for 
BitTorrent-like networks and studied the steady-state network 
performance. Specifically, we obtained expressions for the 
average number of seeds, the average number of downloaders, 
and the average downloading time as functions of the peer 
arrival rate, downloader leaving rate, seed leaving rate, 
uploading bandwidth, etc., which give us explicit insight into 
how the network performance is affected by different 
parameters. We also characterized the variability of the 
system by applying limit theorems to the stochastic model when 
the arrival rate is large. We then abstracted the built-in 
incentive mechanism of BitTorrent and studied its effect on 
network performance. Under certain conditions, we proved that 
a Nash equilibrium exists, under which each peer chooses its 
physical uploading bandwidth to be equal to the actual 
uploading bandwidth. We also briefly discussed the effect of 
optimistic unchoking on free-riding. Our experimental results 
show that the simple fluid model can capture the behavior of 
the system even when the arrival rate is small. 